7-3 Time-domain: PDF: NSDF

The range of ACF values depends on the amplitude of the signal and is usually not known in advance. To obtain a function whose range is limited to [-1, 1], we can use the following NSDF (normalized squared difference function) formula:
$$nsdf(\tau)=\frac{2\sum s(i)s(i+\tau)}{\sum s^2(i)+\sum s^2(i+\tau)}$$
All the summations in the above equation must have the same lower and upper bounds. The range of NSDF is [-1, 1] due to the following inequality, which follows from $(x \pm y)^2 \geq 0$:
$$-1 \leq \frac{2xy}{x^2+y^2} \leq 1$$
Applying this bound term by term, with $x=s(i)$ and $y=s(i+\tau)$, shows that the magnitude of the numerator never exceeds the denominator.

If the selected pitch point is $\tau=\tau_0$, then we define the clarity of this frame as
$$clarity=nsdf(\tau_0)$$
A higher clarity indicates that the frame is closer to a purely periodic waveform. A lower clarity indicates that the frame is less periodic, which is likely to be caused by unvoiced speech or silence. The following is a typical example:
Example 1: frame2nsdf01.m
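
To make the formula concrete, here is a minimal Python sketch of NSDF for a single frame. It is not the frame2nsdf01.m script. The function name `nsdf` and the `max_lag` parameter are assumptions for illustration, and each sum is taken over i = 0 .. n-1-τ so that all three summations share the same bounds. A frame with zero energy is mapped to 0, which is also an assumption.

```python
import math


def nsdf(frame, max_lag=None):
    """Return [nsdf(0), ..., nsdf(max_lag)] for a list of samples.

    Hypothetical helper: the name and signature are illustrative only.
    """
    n = len(frame)
    if max_lag is None:
        max_lag = n - 1
    values = []
    for tau in range(max_lag + 1):
        # All sums use the same bounds: i = 0 .. n-1-tau.
        count = n - tau
        cross = sum(frame[i] * frame[i + tau] for i in range(count))
        energy = sum(frame[i] ** 2 + frame[i + tau] ** 2 for i in range(count))
        # Assumption: define nsdf as 0 when the frame has no energy.
        values.append(2.0 * cross / energy if energy > 0 else 0.0)
    return values


# A pure sine with a period of 50 samples, 400 samples long.
period = 50
frame = [math.sin(2 * math.pi * i / period) for i in range(400)]
curve = nsdf(frame, max_lag=120)
```

For this sine, `curve[50]` equals 1 up to rounding error because the frame repeats exactly after 50 samples, and `curve[25]` equals -1 up to rounding error because the frame is inverted after half a period. If lag 50 is selected as the pitch point, the clarity of the frame is therefore essentially 1.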

The following example uses NSDF to perform pitch tracking:
Example 2: ptByNsdf01.m
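
As a rough illustration of frame-based pitch tracking with NSDF, the following Python sketch picks, for each frame, the lag with the largest NSDF value inside an assumed pitch range and reports a pitch only when the clarity is high enough. It is not the ptByNsdf01.m script: the names `nsdf_at` and `track_pitch`, the frame size, hop size, pitch range, and clarity threshold are all assumptions, and taking the plain maximum is a simplification of the peak-picking described in the reference.

```python
import math


def nsdf_at(frame, tau):
    """NSDF of one frame at lag tau, with shared bounds i = 0 .. n-1-tau."""
    count = len(frame) - tau
    cross = sum(frame[i] * frame[i + tau] for i in range(count))
    energy = sum(frame[i] ** 2 + frame[i + tau] ** 2 for i in range(count))
    # Assumption: a frame with no energy gets nsdf = 0.
    return 2.0 * cross / energy if energy > 0 else 0.0


def track_pitch(signal, fs, frame_size=512, hop=256,
                min_hz=80.0, max_hz=1000.0, clarity_threshold=0.8):
    """Return a list of (pitch_in_hz, clarity) pairs, one per frame.

    All parameter names and default values are illustrative assumptions.
    A pitch of 0.0 marks a frame judged to be unvoiced or silent.
    """
    min_lag = max(1, int(fs / max_hz))
    max_lag = min(frame_size - 1, int(fs / min_hz))
    if min_lag > max_lag:
        raise ValueError("pitch range does not fit the frame size")
    track = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        # Pitch point: the lag with the largest NSDF in the allowed range.
        tau0 = max(range(min_lag, max_lag + 1),
                   key=lambda tau: nsdf_at(frame, tau))
        clarity = nsdf_at(frame, tau0)
        pitch = fs / tau0 if clarity >= clarity_threshold else 0.0
        track.append((pitch, clarity))
    return track


# A 200 Hz sine sampled at 8 kHz has a period of 40 samples.
fs = 8000
sine = [math.sin(2 * math.pi * 200 * i / fs) for i in range(2048)]
voiced_track = track_pitch(sine, fs, min_hz=150.0)
silent_track = track_pitch([0.0] * 1024, fs)
```

The usage above raises the lower pitch limit to 150 Hz on purpose. With the default limit of 80 Hz, lags 40 and 80 both fall inside the search range and have nearly identical NSDF values for this sine, so a plain maximum could select lag 80 and report half of the true pitch.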

We can increase the frame size to reduce pitch-halving errors:
Example 3: ptByNsdf03.m

Reference: McLeod, Philip, and Geoff Wyvill. "A smarter way to find pitch." Proceedings of International Computer Music Conference, ICMC. 2005.
Audio Signal Processing and Recognition (音訊處理與辨識)